IN THE CLAIMS:

Please amend the claims as follows: 

1. (Currently Amended) A computer-implemented method for identifying
correlated columns from database tables, comprising: 

determining correlation attributes for a first column and a second column from 
one or more database tables, the correlation attributes describing for each column at 
least one of the column and content of the column; 

comparing the correlation attributes from the first and second column; 

identifying similarities between the first and second column on the basis of the 
comparison; 

on the basis of the identified similarities, determining whether the first and second 
column are correlated; and 

upon determining the first and second columns are correlated, merging the first 
and second columns to create a third column that contains each data value stored in the 
first and second columns, only if the columns are determined to be correlated; and

storing the third column in the database.

2. (Original) The method of claim 1, wherein identifying the similarities
comprises: 

determining a correlation value indicating a degree of correlation between the 
first and the second column; and 

determining whether the correlation value exceeds a predetermined threshold. 

3. (Original) The method of claim 1, further comprising:

if it is determined that the first and second column are correlated, displaying an 
indication to a user that the first and second column can be merged; and 

in response to user input, merging the first and second column into a single 
column. 



PATENT 

App. Ser. No.: 10/829,624 
Atty. Dkt. No. ROC920030407US1 
PS Ref. No.: 1032.012912 (IBMK30407) 

4. (Original) The method of claim 1, wherein the first column is a column of a
first database table and the second column is a column of a second database table, the 
method further comprising: 

determining correlation attributes for N columns from the first database table and 
M columns from the second database table, where N and M are integers; 

comparing the correlation attributes from each of the N columns with the 
correlation attributes from each of the M columns to identify similarities between the N 
and M columns; and 

on the basis of the identified similarities, determining whether one or more of the 
N and M columns are correlated. 

5. (Original) The method of claim 4, further comprising merging each of the one 
or more of the N and M columns determined to be correlated. 

6. (Original) The method of claim 1, further comprising:

determining, from the one or more database tables, metadata describing
characteristics of each column; and

wherein the correlation attributes are determined on the basis of the determined 
metadata. 

7. (Original) The method of claim 6, wherein the determined metadata describes 
for each column an attribute of a data value in the column. 

8. (Original) The method of claim 6, wherein the determined metadata describes 
for each column at least one of: 

(i) a label; 

(ii) a comment; 

(iii) a constraint;

(iv) a trigger;

(v) a name; 

(vi) a data type; and 

(vii) a column length. 

9. (Original) The method of claim 1, further comprising:

determining, from the one or more database tables, statistical parameters
associated with each of the columns; and

wherein the correlation attributes are determined on the basis of the determined 
statistical parameters. 

10. (Original) The method of claim 9, wherein the determined statistical 
parameters describe for each column at least one of: 

(i) a minimum value; 

(ii) a maximum value; 

(iii) an average value; and 

(iv) a range of values. 

11. (Original) The method of claim 1, further comprising:

determining, from the one or more database tables, ontological properties
describing cognitive qualities associated with each column; and

wherein the correlation attributes are determined on the basis of the determined 
ontological properties. 

12. (Original) The method of claim 11, wherein the determined ontological
properties describe for each column at least one of: 

(i) a synonym; 

(ii) a parent node; and 

(iii) an ancestor node.

13. (Original) The method of claim 11, further comprising:

determining, from the one or more database tables, metadata describing the 
ontological properties. 

14. (Original) The method of claim 1, further comprising:

determining, from the one or more database tables, measurement units
associated with each column; and

wherein the correlation attributes are determined on the basis of the determined 
measurement units. 

15. (Original) The method of claim 14, further comprising: 

determining, from the one or more database tables, metadata describing the 
measurement units. 

16. (Original) The method of claim 14, wherein identifying the similarities 
comprises: 

determining whether the first and second column are associated with similar 
measurement units. 

17. (Currently Amended) A computer-implemented method for identifying 
correlated columns from database tables, comprising: 

determining metadata for at least two columns from one or more database 
tables, the metadata describing characteristics of each column; 

analyzing content from the at least two columns from the one or more database 
tables; and

determining a degree of correlation between the at least two columns using the 
determined metadata and the analyzed content; and

storing the value representing the degree of correlation in the database.

18. (Original) The method of claim 17, wherein determining the degree of
correlation comprises: 

assigning a first correlation value to the determined metadata; 

assigning a second correlation value to the analyzed content, wherein the first 
and second correlation values are different; and 

calculating a total correlation value on the basis of the first and second 
correlation values. 

19. (Original) The method of claim 18, further comprising:

merging the at least two columns if the total correlation value exceeds a 
predetermined threshold value. 

20. (Original) The method of claim 17, wherein analyzing the content comprises 
determining statistical parameters from the content of each column. 

21. (Original) The method of claim 17, further comprising:

merging the first and the at least one second column if it is determined that the 
first and at least one second column are correlated. 

22. (Currently Amended) A computer readable storage medium containing a 
program which, when executed, performs a process for identifying correlated columns 
from database tables, the process comprising: 

determining correlation attributes for a first column and a second column from 
one or more database tables, the correlation attributes describing for each column at 
least one of the column and content of the column; 

comparing the correlation attributes from the first and second column; 

identifying similarities between the first and second column on the basis of the 
comparison; 

on the basis of the identified similarities, determining whether the first and second 
column are correlated; and

upon determining the first and second columns are correlated, merging the first
and second columns to create a third column that contains each data value stored in the
first and second columns, only if the columns are determined to be correlated; and

storing the third column in the database.

23. (Original) The computer readable medium of claim 22, wherein identifying the 
similarities comprises: 

determining a correlation value indicating a degree of correlation between the 
first and the second column; and 

determining whether the correlation value exceeds a predetermined threshold. 

24. (Original) The computer readable medium of claim 22, wherein the process 
further comprises: 

if it is determined that the first and second column are correlated, displaying an 
indication to a user that the first and second column can be merged; and 

in response to user input, merging the first and second column into a single 
column. 

25. (Original) The computer readable medium of claim 22, wherein the first 
column is a column of a first database table and the second column is a column of a 
second database table, the process further comprising: 

determining correlation attributes for N columns from the first database table and 
M columns from the second database table, where N and M are integers; 

comparing the correlation attributes from each of the N columns with the 
correlation attributes from each of the M columns to identify similarities between the N 
and M columns; and 

on the basis of the identified similarities, determining whether one or more of the 
N and M columns are correlated.

26. (Original) The computer readable medium of claim 25, wherein the process
further comprises: 

merging each of the one or more of the N and M columns determined to be 
correlated. 

27. (Original) The computer readable medium of claim 22, wherein the process 
further comprises: 

determining, from the one or more database tables, metadata describing 
characteristics of each column; and 

wherein the correlation attributes are determined on the basis of the determined 
metadata. 

28. (Original) The computer readable medium of claim 27, wherein the
determined metadata describes for each column an attribute of a data value in the 
column. 

29. (Original) The computer readable medium of claim 27, wherein the 
determined metadata describes for each column at least one of: 

(i) a label; 

(ii) a comment; 

(iii) a constraint; 

(iv) a trigger; 

(v) a name; 

(vi) a data type; and 

(vii) a column length. 

30. (Original) The computer readable medium of claim 22, wherein the process 
further comprises: 

determining, from the one or more database tables, statistical parameters 
associated with each of the columns; and

wherein the correlation attributes are determined on the basis of the determined
statistical parameters. 

31. (Original) The computer readable medium of claim 30, wherein the
determined statistical parameters describe for each column at least one of: 

(i) a minimum value; 

(ii) a maximum value; 

(iii) an average value; and 

(iv) a range of values. 

32. (Original) The computer readable medium of claim 22, wherein the process 
further comprises: 

determining, from the one or more database tables, ontological properties 
describing cognitive qualities associated with each column; and 

wherein the correlation attributes are determined on the basis of the determined 
ontological properties. 

33. (Original) The computer readable medium of claim 32, wherein the 
determined ontological properties describe for each column at least one of: 

(i) a synonym; 

(ii) a parent node; and 

(iii) an ancestor node. 

34. (Original) The computer readable medium of claim 32, wherein the process 
further comprises: 

determining, from the one or more database tables, metadata describing the 
ontological properties. 

35. (Original) The computer readable medium of claim 22, wherein the process 
further comprises:

determining, from the one or more database tables, measurement units
associated with each column; and 

wherein the correlation attributes are determined on the basis of the determined 
measurement units. 

36. (Original) The computer readable medium of claim 35, wherein the process 
further comprises: 

determining, from the one or more database tables, metadata describing the 
measurement units. 

37. (Original) The computer readable medium of claim 35, wherein identifying the 
similarities comprises: 

determining whether the first and second column are associated with similar 
measurement units. 

38. (Currently Amended) A computer readable storage medium containing a 
program which, when executed, performs a process for identifying correlated columns 
from database tables, the process comprising: 

determining metadata for at least two columns from one or more database 
tables, the metadata describing characteristics of each column; 

analyzing content from the at least two columns from the one or more database 
tables; and 

determining a degree of correlation between the at least two columns using the 
determined metadata and the analyzed content; and

storing the value representing the degree of correlation in the database.

39. (Original) The computer readable medium of claim 38, wherein determining 
the degree of correlation comprises: 

assigning a first correlation value to the determined metadata;

assigning a second correlation value to the analyzed content, wherein the first
and second correlation values are different; and 

calculating a total correlation value on the basis of the first and second 
correlation values. 

40. (Original) The computer readable medium of claim 39, wherein the process 
further comprises: 

merging the at least two columns if the total correlation value exceeds a 
predetermined threshold value. 

41. (Original) The computer readable medium of claim 38, wherein analyzing the
content comprises: 

determining statistical parameters from the content of each column. 

42. (Original) The computer readable medium of claim 38, wherein the process 
further comprises: 

merging the first and the at least one second column if it is determined that the 
first and at least one second column are correlated. 

43. (Currently Amended) A data processing system, comprising:

a processor;

at least one database having one or more database tables; and

a correlation manager for identifying correlated columns from the one or more
database tables, the correlation manager which, when executed by the processor, is
being configured for:

determining correlation attributes for a first column and a second column 
from the one or more database tables, the correlation attributes describing for 
each column at least one of the column and content of the column; 

comparing the correlation attributes from the first and second column;

identifying similarities between the first and second column on the basis of
the comparison; 

on the basis of the identified similarities, determining whether the first and 
second column are correlated; and 

upon determining the first and second columns are correlated, merging the 
first and second columns to create a third column that contains each data value 
stored in the first and second columns, only if the columns are determined to be
correlated; and

storing the third column in the database.



44. (Currently Amended) A data processing system, comprising:

a processor;

at least one database having one or more database tables; and

a correlation manager for identifying correlated columns from the one or more
database tables, the correlation manager which, when executed by the processor, is
being configured for:

determining metadata for at least two columns from the one or more 
database tables, the metadata describing characteristics of each column; 

analyzing content from the at least two columns from the one or more 
database tables; and 

determining a degree of correlation between the at least two columns using the 
determined metadata and the analyzed content; and

storing the value representing the degree of correlation in the database.